Extracting Data Records from Unstructured Biomedical Full Text
Abstract
In this paper, we address the problem of extracting data records and their attributes from unstructured biomedical full text, a problem on which little work has been reported in the research community. We argue that semantics is important for record extraction or finer-grained language processing tasks. We derive a data record template, including semantic language models, from unstructured text and represent them with a discourse-level Conditional Random Fields (CRF) model. We evaluate the approach from the perspective of Information Extraction and achieve significant improvements in system performance over baseline systems.
Related resources
A Review of Towered Big-Data Service Model for Biomedical Text-Mining Databases
The rapid growth of biomedical informatics has drawn increasing popularity and attention. The reasons behind this are advances in genomic, new molecular, and biomedical approaches, and various applications such as protein identification, patient medical records, genome sequencing, and medical imaging, with a huge set of biomedical research data being generated day to day. The increase of biomedical da...
Unstructured Data Integration through Automata-Driven Information Extraction
Extracting information from plain text and restructuring it into relational databases raises the challenge of how to locate relevant information and update database records accordingly. In this paper, we propose a wrapper to efficiently extract information from unstructured documents containing plain text expressed in natural-like language. Our extraction approach is based on the automata for...
Using Text Analytics to Derive Customer Service Management Benefits from Unstructured Data
The Growth of Text Analytics. Estimates suggest that about 80% of today’s enterprise data is unstructured. Unlike structured data, which is tidy and mostly numeric, unstructured data is often textual and, therefore, messy. Unstructured data comprises documents, emails, instant messages or user posts and comments on social media, and presents a challenge to data miners; analyzing unstructured d...
Biomedical Text Mining Using a Grid Computing Approach
Extracting useful information from a very large amount of biomedical text is an important and difficult activity in the field of biomedicine. The data to be examined are generally unstructured, and the available computational resources still do not provide adequate mechanisms for retrieving and analysing very large amounts of content. In this paper we present a rule-based system for Text Mining process appl...
A hybrid solution for extracting structured medical information from unstructured data in medical records via a double-reading/entry system
BACKGROUND: Healthcare providers generate a huge amount of biomedical data stored in either legacy system (paper-based) format or electronic medical records (EMR) around the world, which are collectively referred to as big biomedical data (BBD). To realize the promise of BBD for clinical use and research, it is an essential step to extract key data elements from unstructured medical records into...
Publication date: 2007